Application checkpointing

Results: 180



#Item
141 Application checkpointing / Computer cluster / System call / Unix / Operating system / Dynamic loading / Single system image / Computing / Fault-tolerant computer systems / Computer architecture

Comments on “Transparent User-Level Process Checkpoint and Restore for Migration” by Bozyigit and Wasiq Felix Rauch, Thomas M. Stricker Laboratory for Computer Systems ETH - Swiss Institute of Technology CH-8092 Zü

Source URL: www.cs.inf.ethz.ch

Language: English - Date: 2002-08-21 05:33:58
142 Application programming interfaces / Fault-tolerant computer systems / Message Passing Interface / MPICH / Open MPI / Computer cluster / K computer / Exascale computing / Application checkpointing / Computing / Concurrent computing / Parallel computing

REDUNDANT EXECUTION OF HPC APPLICATIONS WITH MR-MPI Christian Engelmann and Swen Böhm Computer Science and Mathematics Division Oak Ridge National Laboratory, Oak Ridge, TN, USA email: [removed], bohms@ornl.g

Source URL: www.christian-engelmann.info

Language: English - Date: 2011-05-03 17:32:34
143 Software quality / Data quality / Quality / Survival analysis / Fault-tolerant system / Soft error / Application checkpointing / Fault-tolerant design / Reliability engineering / Computing / Fault-tolerant computer systems / Systems engineering

Addressing Failures in Exascale Computing Marc Snir Robert W. Wisniewski Jacob A. Abraham Sarita V Adve Saurabh Bagchi Pavan Balaji

Source URL: www.mcs.anl.gov

Language: English - Date: 2013-05-15 16:15:22
144 Information / Transaction processing / Data management / Application checkpointing / Rollback / Fault-tolerant system / Database transaction / Communications protocol / Distributed computing / Fault-tolerant computer systems / Data / Computing

A Survey of Rollback-Recovery Protocols in Message-Passing Systems Mootaz Elnozahy* Yi-Min Wang‡ Lorenzo Alvisi†

Source URL: www.cs.rice.edu

Language: English - Date: 2007-08-22 00:57:57
145 Distributed computing architecture / Data synchronization / Distributed shared memory / Replication / TreadMarks / Application checkpointing / Page fault / Algorithms for Recovery and Isolation Exploiting Semantics / Parallel computing / Computing / Fault-tolerant computer systems / Computer architecture

Tech. Rep[removed] Using Peer Support to Reduce Fault-Tolerant Overhead in Distributed Shared Memories Galen C. Hunt Michael L. Scott

Source URL: www.cs.rochester.edu

Language: English - Date: 2011-04-01 15:39:44
146 Reliability engineering / Failure / Survival analysis / Parallel computing / Application checkpointing / Mean time between failures / Redundancy / Failure rate / Computer cluster / Fault-tolerant computer systems / Computing / Systems engineering

Combining Partial Redundancy and Checkpointing for HPC James Elliott∗, Kishor Kharbas∗, David Fiala∗, Frank Mueller∗, Kurt Ferreira† and Christian Engelmann‡ ∗ North Carolina State University, Rale

Source URL: www.christian-engelmann.info

Language: English - Date: 2012-03-22 20:59:59
147 Live migration / Replication / Application checkpointing / Xen / Hyper-V / Virtual machine / Backup / Hypervisor / Disk mirroring / System software / Software / Fault-tolerant computer systems

Remus: High Availability via Asynchronous Virtual Machine Replication Brendan Cully, Geoffrey Lefebvre, Dutch Meyer, Mike Feeley, Norm Hutchinson, and Andrew Warfield∗ Department of Computer Science The University of B

Source URL: www.cs.ubc.ca

Language: English - Date: 2009-08-25 18:11:21
148 Science / Systems engineering / United States Department of Energy National Laboratories / Fault-tolerant system / Software / Application checkpointing / Fault injection / Resilience / Psychological resilience / Computing / Fault-tolerant computer systems / Software quality

Fault Management Workshop Final Report August 13, 2012 U.S. Department of Energy Fault Management Workshop BWI Airport Marriott, Maryland June 6, 2012

Source URL: science.energy.gov

Language: English - Date: 2012-12-10 09:26:24
149 Memory management / Data types / Primitive types / Pointer / Software bugs / Application checkpointing / Call stack / Stack / C / Computing / Software engineering / Computer programming

Compiler Technology for Portable Checkpoints Volker Strumpen Laboratory for Computer Science Massachusetts Institute of Technology Cambridge, MA [removed] [removed]

Source URL: supertech.csail.mit.edu

Language: English - Date: 2014-09-16 08:27:50
150 Application checkpointing / Fault-tolerant system / Compiler / Io / Fault-tolerant computer systems / Computing / Software engineering

Portable Fault-Tolerant File I/O by Igor B. Lyubashevskiy Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degrees of

Source URL: supertech.csail.mit.edu

Language: English - Date: 2014-09-16 08:27:50